Resource-Efficient, Hierarchical Auto-Tuning of a Hybrid Lattice Boltzmann Computation on the Cray XT4

Authors

  • Samuel Williams
  • Jonathan Carter
  • Leonid Oliker
  • John Shalf
  • Katherine Yelick
Abstract

We apply auto-tuning to a hybrid MPI-pthreads lattice Boltzmann computation running on the Cray XT4 at the National Energy Research Scientific Computing Center (NERSC). Previous work showed that multicore-specific auto-tuning can improve the performance of lattice Boltzmann magnetohydrodynamics (LBMHD) by a factor of 4 when running on dual- and quad-core Opteron dual-socket SMPs. We extend these studies to the distributed memory arena via a hybrid MPI/pthreads implementation. In addition to conventional auto-tuning at the local SMP node, we tune at the message-passing level to determine the optimal aspect ratio as well as the correct balance between MPI tasks and threads per MPI task. Our study presents a detailed performance analysis when moving along an isocurve of constant hardware usage: fixed total memory, total cores, and total nodes. Overall, our work points to approaches for improving intra- and inter-node efficiency on large-scale multicore systems for demanding scientific applications.
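As a rough illustration of the message-passing-level search space described in the abstract, the sketch below enumerates candidate configurations at fixed hardware usage: every split of a node's cores into MPI tasks and threads per task, combined with every process-grid shape for the resulting task count. The function names, the 3D process grid, and the dictionary layout are assumptions made for this example and are not taken from the paper; a real auto-tuner would additionally time each candidate, which is omitted here.

```python
def task_thread_splits(cores_per_node):
    """Return all (mpi_tasks_per_node, threads_per_task) pairs using every core."""
    return [
        (tasks, cores_per_node // tasks)
        for tasks in range(1, cores_per_node + 1)
        if cores_per_node % tasks == 0
    ]


def process_grids(total_tasks):
    """Return all ordered 3D grids (px, py, pz) with px * py * pz == total_tasks.

    The 3D decomposition is an assumption of this sketch.
    """
    grids = []
    for px in range(1, total_tasks + 1):
        if total_tasks % px != 0:
            continue
        rest = total_tasks // px
        for py in range(1, rest + 1):
            if rest % py == 0:
                grids.append((px, py, rest // py))
    return grids


def candidate_configs(nodes, cores_per_node):
    """Enumerate configurations that keep total nodes and total cores fixed."""
    configs = []
    for tasks_per_node, threads_per_task in task_thread_splits(cores_per_node):
        total_tasks = nodes * tasks_per_node
        for grid in process_grids(total_tasks):
            configs.append(
                {
                    "tasks_per_node": tasks_per_node,
                    "threads_per_task": threads_per_task,
                    "grid": grid,  # shape of the MPI process grid (aspect ratio)
                }
            )
    return configs


if __name__ == "__main__":
    # Hypothetical example: 2 nodes with 4 cores each.
    for cfg in candidate_configs(nodes=2, cores_per_node=4):
        print(cfg)
```

Every candidate uses the same number of nodes and cores, so comparing them corresponds to moving along a curve of constant hardware usage rather than adding resources.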


Similar resources

Impact of Quad-Core Cray XT4 System and Software Stack on Scientific Computation

An upgrade from dual-core to quad-core AMD processors on the Cray XT system at the Oak Ridge National Laboratory (ORNL) Leadership Computing Facility (LCF) has resulted in significant changes in the hardware and software stack, including a deeper memory hierarchy, SIMD instructions and a multi-core aware MPI library. In this paper, we evaluate the impact of a subset of these key changes on large-sca...

Auto-Tuning Distributed-Memory 3-Dimensional Fast Fourier Transforms on the Cray XT4

We present auto-tuning, optimization, and performance modeling of 3-Dimensional Fast Fourier Transforms on the Cray XT4 (Franklin) system. Spectral methods involving FFTs are a commonly used numerical technique with applications in engineering, chemistry, geosciences, and other areas of scientific computing. In the case of materials science the wavefunctions of the electrons are expanded in spatial ...

Future Proof Parallelism for Electron-Atom Scattering Codes on the XT4

Electron collisions with atoms were among the earliest problems studied using quantum mechanics. However, the accurate computation of much of the data required in astrophysics and plasma physics still presents huge computational challenges, even on the latest generation of high-performance computer architectures, such as the Cray XT series. In recent years a suite of parallel programs based on ...

External and Internal Incompressible Viscous Flows Computation using Taylor Series Expansion and Least Square based Lattice Boltzmann Method

The lattice Boltzmann method (LBM) has recently become an alternative and promising computational fluid dynamics approach for simulating complex fluid flows. Despite its enormous success in many practical applications, the standard LBM is restricted to the lattice uniformity in the physical space. This is the main drawback of the standard LBM for flow problems with complex geometry. Several app...

Using Processor Partitioning to Evaluate the Performance of MPI, OpenMP and Hybrid Parallel Applications on Dual- and Quad-core Cray XT4 Systems

Chip multiprocessors (CMP) are widely used for high performance computing. While this presents significant new opportunities, such as on-chip high inter-core bandwidth and low inter-core latency, it also presents new challenges in the form of inter-core resource conflict and contention. A challenge to be addressed is how well current parallel programming paradigms, such as MPI, OpenMP and hybri...

Publication date: 2009